Automatic Speech Recognition for Indoor HRI Scenarios

Abstract

This article presents a stand-alone automatic speech recognition system that accounts for listener movement, time-varying reverberation effects, environmental noise, and user position information for beamforming approaches in a human-robot interaction (HRI) setting. We stress the importance of replacing the classical black-box integration of the technology into applications with one that incorporates acoustic environment representation and modeling and the target source direction. Test data were recorded on a real robot under various movement conditions. To address the channel problem by incorporating its effect during training, clean samples were passed through estimated static responses and noise was added. Beamforming is investigated with regard to oracle tracking using, for instance, image processing. The proposed strategy is of interest to the robotics community because it allows the development of voice-based systems with limited training, without relying on third-party technologies or Internet access, which eliminates the need to upload to the cloud. In our mobile scenario, the resulting engine provided an average word error rate at least 19% and 34% lower than publicly available APIs in the playback (i.e., loudspeaker) and human testing modalities, respectively.
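
The training-data step described in the abstract (clean samples passed through estimated static responses, then noise added) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `simulate_channel`, its parameters, the scaling of noise to a target signal-to-noise ratio, and the synthetic signals are all assumptions made for the example.

```python
import numpy as np


def simulate_channel(clean, impulse_response, noise, snr_db):
    """Hypothetical helper: apply a static response to clean speech, then add noise.

    clean, impulse_response and noise are 1-D float arrays at the same sample rate.
    snr_db is the assumed target signal-to-noise ratio in decibels.
    """
    # Pass the clean signal through the static response (linear convolution),
    # trimmed so the output has the same length as the clean input.
    reverberant = np.convolve(clean, impulse_response)[: len(clean)]

    # Repeat or trim the noise recording to match the signal length.
    reps = int(np.ceil(len(reverberant) / len(noise)))
    noise = np.tile(noise, reps)[: len(reverberant)]

    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        return reverberant  # nothing to scale against

    # Scale the noise so that speech_power / scaled_noise_power matches snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return reverberant + gain * noise


# Synthetic stand-ins for a clean utterance, an estimated response, and noise.
rng = np.random.default_rng(0)
clean_utterance = rng.standard_normal(16000)
estimated_response = np.array([1.0, 0.0, 0.5, 0.25])
recorded_noise = rng.standard_normal(4000)
training_sample = simulate_channel(
    clean_utterance, estimated_response, recorded_noise, snr_db=10.0
)
```

In a setup like the one described, each clean utterance would presumably be paired with one of several estimated responses so that the recognizer sees the channel effect during training; how the responses and noise levels were actually chosen is not stated in the abstract.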

Similar resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Automatic speech recognition for children

In this paper, the acoustic and linguistic characteristics of children's speech are investigated in the context of automatic speech recognition. Acoustic variability is identified as a major hurdle in building high performance ASR applications for children. A simple speaker normalization algorithm combining frequency warping and spectral shaping introduced in [5] is shown to reduce acoustic variab...

Revisiting scenarios and methods for variable frame rate analysis in automatic speech recognition

In this paper we present a revision and evaluation of some of the main methods used in variable frame rate (VFR) analysis, applied to speech recognition systems. The work found in the literature in this area usually deals with restricted conditions and scenarios and we have revisited the main algorithmic alternatives and evaluated them under the same experimental framework, so that we have been...

Automatic speech recognition

Robust speech recognition in client-server scenarios

This paper addresses issues that are specific to the implementation of automatic speech recognition (ASR) applications and services in client-server scenarios. It is assumed in all of these scenarios that functionality in a human-machine dialog system is distributed between mobile client devices and network based multi-user media and application servers. It is argued that, while there has alrea...

Journal

Journal title: ACM Transactions on Human-Robot Interaction

Year: 2021

ISSN: 2573-9522

DOI: https://doi.org/10.1145/3442629